1. Introduction / problem setting

Frame: we are a think tank approached by policymakers from the immigration office with the task of informing them about how immigration has been discussed in the House of Commons (HoC).
- With Covid and the UK entering a period of hardship, immigration will become a sensitive topic, and they want to be prepared.
- They want to understand the UK's relation to immigration, in terms of both the intensity of discussions and their sentiment.
- Understanding topics and sentiment is important for coalition building.
- They have quantitative data but wish to understand the discourse better. To that end, we came up with the following guiding hypotheses: 1. 2. 3.

It’s 1978. Margaret Thatcher gives an interview to the British programme World in Action in which she voices what, in her view, is the current sentiment of the people regarding immigration. Remarkably, she notes that “People are really rather afraid that this country might be rather swamped by people with a different culture.” Jumping ahead in time: it’s 2010, and David Cameron wins the most seats in the general election but falls short of a majority, which eventually results in a coalition government with the Liberal Democrats. It’s 2015 now: an exceptional number of asylum seekers make their way to the European Union (EU), and the United Kingdom (UK) faces double the number of asylum applications it received just one year before (Riihimäki, 2016). A year later, on June 23, 2016, the British vote to withdraw from the EU. All in all, things haven’t been easy for the UK.

Because these transformations were driven in part by discussions about “people with a different culture”, we need to understand how these people are discussed and framed, and whether these frames and sentiments change around specific events such as Brexit and the migration wave of 2015. A good place to start is the parliamentary speeches from the HoC debates, because they are given by publicly elected members who, in principle, represent the interests and voice of their electors. Further, because the discussions in the House are instrumental to the policies and legislation regarding immigrants that follow, understanding the views and frames voiced in these discussions may shed light on how they shaped the resulting immigration policy. In this sense, our text analysis could help us better understand a treatment, which in turn allows hypotheses for future research to be derived.

2. Packages and data

explain which data has been used and forms the basis of our analysis – text analysis, what it is, text sentiment and topic and what can we get out of it.

To understand how immigrants are framed and perceived when discussed in parliamentary debates, we used the ParlSpeech V2 database by Christian Rauh and Jan Schwalbach (2020). This database is unique in its scope, covering all parliamentary debates from 1998 until 2020, resulting in 1,956,223 speeches (Rauh & Schwalbach, 2020, p. 10). The text was collected from the digital Commons Hansard, which contains the plenary protocols and documents from which speech texts and metadata are extracted. The corpus includes a range of covariates, such as party affiliation and agenda, which make it easier to analyse how the topic is discussed by representatives of different parties and in different agenda contexts. To measure sentiment, we also use the Lexicoder 2015 sentiment dictionary, which has been established to produce reliable estimates and consists of 2,858 word patterns relating to negative sentiment and 1,709 word patterns indicating positive sentiment (Young & Soroka, 2012).
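As a rough illustration of how dictionary-based sentiment scoring works, the sketch below counts tokens that match positive and negative word patterns and normalises the difference by speech length. It is a minimal sketch in plain Python, not the pipeline used for this analysis: the two pattern lists are toy stand-ins rather than real Lexicoder entries, and the score `(positive - negative) / tokens` is an assumed formula chosen for simplicity.

```python
import re
from fnmatch import fnmatchcase

# Toy stand-ins for dictionary entries (assumption: NOT the real Lexicoder lists).
# A trailing "*" matches any word ending, as in glob-style word patterns.
NEGATIVE = ["afraid", "swamp*", "crisis", "illegal*"]
POSITIVE = ["welcom*", "support*", "benefit*", "safe"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_matches(tokens, patterns):
    """Count tokens that match at least one glob-style pattern."""
    return sum(any(fnmatchcase(tok, pat) for pat in patterns) for tok in tokens)


def net_sentiment(text):
    """Return (positive - negative) / total tokens; 0.0 for empty text.

    The normalisation by token count is an assumption made for this sketch.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = count_matches(tokens, POSITIVE)
    neg = count_matches(tokens, NEGATIVE)
    return (pos - neg) / len(tokens)
```

Normalising by the number of tokens keeps long and short speeches comparable; other choices, such as a log ratio of positive to negative counts, would be equally defensible.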

2.1 Subset

explain the subset + limitations

Choosing a unit of analysis is a challenging task. In our case, the decisions we took reflected both substantive and practical considerations, in particular the need to narrow down a very large database in order to perform a more in-depth analysis. We therefore focus on texts from 2010 onwards. 2010 is a good starting point for our analysis because it was the year of the Tory manifesto and of the general election, which resulted in a win for the Conservative Party. This gives us a sufficient time frame, with observations both before our main events of interest (the 2015 General Election, the migration wave and the Brexit referendum) and after them, from 2016 until 2020.

In terms of content, we restrict the corpus to speeches that contain a reference to keywords related to the topic, specifically “immigra”, “refugee” or “asylum”. We expect parliamentary debates to be explicit in their language, meaning that if immigration is discussed, one of these keywords will appear either in the agenda description or in the speech itself; we therefore think this method captures most of the substantive debates regarding immigration (Van Dijk, 2000). This type of subsetting allows us to focus our analysis and remove noise from unrelated text. Its limitation is that it excludes any document that discusses immigration without mentioning one of the three key terms in either the agenda description or the text. Further, with this subsetting we are very likely to lose short responses to speeches.
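The subsetting rule described above can be sketched as a simple filter. This is a hedged illustration in plain Python rather than the code behind the analysis: the field names `date`, `agenda` and `text`, the `id` values and the four example speeches are all made up for the sketch.

```python
import re
from datetime import date

# The three key terms from the text, matched as substrings so that
# "immigra" also covers "immigration", "immigrant", "immigrants", etc.
KEYWORDS = re.compile(r"immigra|refugee|asylum", re.IGNORECASE)
START = date(2010, 1, 1)  # assumption: the subset starts on 1 January 2010


def is_relevant(speech):
    """Keep a speech if it is in the period and a keyword appears
    in either the agenda description or the speech text."""
    in_period = speech["date"] >= START
    haystack = f'{speech["agenda"]} {speech["text"]}'
    return in_period and KEYWORDS.search(haystack) is not None


# Hypothetical example records, invented for illustration only.
speeches = [
    {"id": 1, "date": date(2009, 5, 1), "agenda": "Immigration",
     "text": "Immigration rules must change."},
    {"id": 2, "date": date(2015, 9, 8), "agenda": "Home Affairs",
     "text": "We should help asylum seekers."},
    {"id": 3, "date": date(2016, 1, 12), "agenda": "Immigration Bill",
     "text": "I agree."},
    {"id": 4, "date": date(2017, 3, 2), "agenda": "Fisheries",
     "text": "Support for migrant workers in fisheries."},
]

subset = [s for s in speeches if is_relevant(s)]
```

In this toy data, speech 3 is kept only because of its agenda description, while speech 4 is dropped even though it touches on migration, which illustrates the limitation noted above: "migrant" does not contain any of the three key terms.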

2.2 Foundational Dataframes & Considerations

mention that we will look at two basic subsets: a general one with all observations of the initial subset, and one based on the context of the keywords. Justify why.

mention events etc.

2.2.1 General Corpus

2.2.3 General consideration/definitions used across analysis

3. Descriptives

Justifications and thoughts here.

Plot 3: Prevalence of immigration debates over time by month | Total number of words as a proxy for time spent on debating.

Concentration of party-specific contributions

This density plot gives us a sense of how often each party discussed immigration in each month of our research time frame. In essence, it counts how many words each party devoted to immigration-related topics. For example, while the SNP and the DUP spoke more about immigration after Brexit, other parties show a more constant level of engagement with immigration-related speech. Importantly, the information that can be gathered from this graph is limited: it tells us nothing about the substance of these speeches, only, crudely, how many words were used. Nevertheless, this descriptive visualization does give us an initial sense of the prevalence of immigration-related speech in each of the parties we focus on.
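The quantity behind such a plot, the number of words per party and month, can be computed as in the minimal sketch below. It is written in plain Python for illustration only: the field names `party`, `date` and `text` are assumptions, and words are counted by simple whitespace splitting, which may differ from the tokenisation used for the actual plot.

```python
from collections import defaultdict
from datetime import date


def words_per_party_month(speeches):
    """Sum whitespace-separated word counts per (party, "YYYY-MM") pair."""
    totals = defaultdict(int)
    for speech in speeches:
        month = speech["date"].strftime("%Y-%m")
        totals[(speech["party"], month)] += len(speech["text"].split())
    return dict(totals)


# Hypothetical example records, invented for illustration only.
example = [
    {"party": "SNP", "date": date(2016, 7, 4),
     "text": "Refugees deserve our support"},
    {"party": "SNP", "date": date(2016, 7, 19),
     "text": "Asylum policy matters"},
    {"party": "DUP", "date": date(2016, 7, 5),
     "text": "Immigration after the referendum"},
]

counts = words_per_party_month(example)
```

Each value in `counts` is a crude proxy for time spent debating; it says nothing about what was said.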

4. Sentiment

Sentiment | Overall Corpus

Graph 1: Overall Sentiment

Graph 2: Sentiment by party

5. Sentiment in Context

2.2.2 Create KWIC - Dataframe, Corpus and Dfm

Subset KWIC according to keywords

Sentiment of keywords in context (KWIC)

Graph 3: KWIC sentiment

Note: I think this should also be discarded. It may be strange to plot sentiment based on bubbles of 40 words at a time. Alternatively, we could justify it by assuming that within these bubbles the speech is likely to be genuinely related to immigration, which is also where most sentiment is likely to be voiced. But I am not that convinced by this myself.
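To make the idea of a keyword-in-context (KWIC) window concrete, the sketch below extracts the tokens surrounding each keyword hit. It is a minimal plain-Python illustration, not the code used to build the KWIC dataframe: the default of 20 tokens on each side is an assumption (one possible reading of a 40-word bubble), and the tokeniser is deliberately simple.

```python
import re

# The three key terms used for subsetting, matched as substrings of a token.
KEYWORD = re.compile(r"immigra|refugee|asylum")


def kwic_windows(text, window=20):
    """Return one token list per keyword hit: up to `window` tokens before
    the hit, the hit itself, and up to `window` tokens after it.

    Assumption: `window` counts tokens on EACH side of the keyword.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    windows = []
    for i, tok in enumerate(tokens):
        if KEYWORD.search(tok):
            start = max(0, i - window)
            windows.append(tokens[start:i + window + 1])
    return windows


sentence = "The government must protect every refugee arriving on our shores today"
hits = kwic_windows(sentence, window=2)
```

Sentiment would then be scored on each window rather than on the whole speech. Note that windows overlap when keywords occur close together, so the same words can be counted more than once.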